Using Abstract Models of Behaviours to Automatically Generate Reinforcement Learning Hierarchies
Abstract
In this paper we present a hybrid system combining techniques from symbolic planning and reinforcement learning. Planning is used to automatically construct task hierarchies for hierarchical reinforcement learning based on abstract models of the behaviours’ purpose, and to perform intelligent termination improvement when an executing behaviour is no longer appropriate. Reinforcement learning is used to produce concrete implementations of abstractly defined behaviours and to learn the best possible choice of behaviour when plans are ambiguous. Two new hierarchical reinforcement learning algorithms are presented: Planned Hierarchical Semi-Markov Q-Learning (P-HSMQ), a variant of the HSMQ algorithm (Dietterich, 2000b) which uses plan-built task hierarchies, and Teleo-Reactive Q-Learning (TRQ), a more complex algorithm which implements hierarchical reinforcement learning with teleo-reactive execution semantics (Nilsson, 1994). Each algorithm is demonstrated in a simple grid-world domain.
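As background for the semi-Markov Q-learning family that HSMQ belongs to, the following is a minimal sketch of the generic semi-Markov (SMDP) Q-learning update for a temporally extended behaviour. It is not the paper's P-HSMQ or TRQ algorithm. The function name `smdp_q_update`, the state and behaviour labels, and the values of `alpha` and `gamma` are all assumptions chosen for illustration.

```python
from collections import defaultdict


def smdp_q_update(q, state, behaviour, rewards, next_state, behaviours,
                  alpha=0.1, gamma=0.9):
    """Apply one SMDP Q-learning update after a behaviour has terminated.

    `rewards` holds the primitive-step rewards collected while the
    behaviour was executing, so the behaviour lasted len(rewards) steps.
    All names here are hypothetical and not taken from the paper.
    """
    k = len(rewards)
    # Discounted return accumulated during the behaviour's execution.
    ret = sum((gamma ** i) * r for i, r in enumerate(rewards))
    # Best current estimate among the behaviours available afterwards.
    best_next = max((q[(next_state, b)] for b in behaviours), default=0.0)
    # The next-state value is discounted by the behaviour's duration k.
    target = ret + (gamma ** k) * best_next
    q[(state, behaviour)] += alpha * (target - q[(state, behaviour)])
    return q[(state, behaviour)]


# Illustrative call: a three-step behaviour in a made-up two-room grid world.
q = defaultdict(float)
new_value = smdp_q_update(
    q, "room_a", "go_to_door", [-1.0, -1.0, 10.0], "room_b",
    ["go_to_door", "go_to_goal"], alpha=0.5, gamma=0.9,
)
```

In a plan-built hierarchy, the `behaviours` argument would presumably be restricted to the behaviours the planner considers applicable in `next_state`; the sketch leaves that choice to the caller.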
Similar resources
Efficient Reinforcement Learning with Hierarchies of Machines by Leveraging Internal Transitions
In the context of hierarchical reinforcement learning, the idea of hierarchies of abstract machines (HAMs) is to write a partial policy as a set of hierarchical finite state machines with unspecified choice states, and use reinforcement learning to learn an optimal completion of this partial policy. Given a HAM with deep hierarchical structure, there often exist many internal transitions where ...
Hierarchical Reinforcement Learning: A Hybrid Approach
In this thesis we investigate the relationships between the symbolic and subsymbolic methods used for controlling agents by artificial intelligence, focusing in particular on methods that learn. In light of the strengths and weaknesses of each approach, we propose a hybridisation of symbolic and subsymbolic methods to capitalise on the best features of each. We implement such a hybrid system, c...
Speeding Up HAM Learning with Internal Transitions
In the context of hierarchical reinforcement learning, the idea of hierarchies of abstract machines (HAMs) is to write a partial policy as a set of hierarchical finite state machines with unspecified choice states, and use reinforcement learning to learn an optimal completion of this partial policy. Given a HAM with potentially deep hierarchical structure, there often exist many internal transi...
RL-TOPs: An Architecture for Modularity and Re-Use in Reinforcement Learning
This paper introduces the RL-TOPs architecture for robot learning, a hybrid system combining teleo-reactive planning and reinforcement learning techniques. The aim of this system is to speed up learning by decomposing complex tasks into hierarchies of simple behaviours which can be learnt more easily. Behaviours learnt in this way can subsequently be re-used to solve a variety of problems, redu...
Partial Order Hierarchical Reinforcement Learning
In this paper the notion of a partial-order plan is extended to task-hierarchies. We introduce the concept of a partial-order task-hierarchy that decomposes a problem using multi-tasking actions. We go further and show how a problem can be automatically decomposed into a partial-order task-hierarchy, and solved using hierarchical reinforcement learning. The problem structure determines the reduc...